Identity and Reference in web-based Knowledge Representation (IR-KR)

Authors

  • Wojciech M. Barczynski
  • Falk Brauer
  • Alexander Löser
  • Adrian Mocan
Abstract

Accurate information extraction programs are a key prerequisite for correctly identifying entities on the Web, but developing them is far from trivial. Moreover, maintaining, optimizing, and customizing such programs requires significant effort and resources. Recent trends in Information Extraction (IE) research introduce the vision of algebraic information extraction. An important aspect of this vision is the declarative description of extraction flows, outside the monolithic IE program, using a small set of generic operators. In our approach, we follow this vision and address three main requirements that have never before been addressed together in a single system. First, we introduce a new methodology for efficient entity extraction from unstructured data, which involves mapping the extracted entities to existing structured or semi-structured data. Second, we propose a set of operators addressing a comprehensive set of IE tasks, such as extracting atomic elements, aggregating them into complex real-world objects, and identifying relationships. Third, we propose operators for leveraging global identifier providers, such as OKKAM. To verify our approach, we have implemented an information extraction system and evaluated our operators on a real example extraction flow for retrieving product information from forum pages.
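
The idea of a declarative extraction flow built from a few generic operators can be illustrated with a minimal Python sketch. It is not the authors' operator set or system: the names `Span`, `extract`, `aggregate`, `run_flow`, and `resolve_id`, the regular expressions, the sample forum sentence, and the `urn:example:` identifier are all assumptions made for illustration. The lookup table only stands in for a global identifier provider and does not model the OKKAM interface.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """An annotated region of the document (hypothetical type)."""
    kind: str
    text: str
    start: int
    end: int


# An operator maps (document, spans so far) to a new list of spans.
Operator = Callable[[str, List[Span]], List[Span]]


def extract(kind: str, pattern: str) -> Operator:
    """Atomic extraction: emit one span per regex match."""
    regex = re.compile(pattern)

    def op(doc: str, spans: List[Span]) -> List[Span]:
        found = [Span(kind, m.group(0), m.start(), m.end())
                 for m in regex.finditer(doc)]
        return spans + found

    return op


def aggregate(kind: str, left: str, right: str, max_gap: int) -> Operator:
    """Aggregation: join a `left` span with a `right` span that starts
    at most `max_gap` characters after it into one complex span."""

    def op(doc: str, spans: List[Span]) -> List[Span]:
        out = list(spans)
        for a in spans:
            if a.kind != left:
                continue
            for b in spans:
                if b.kind == right and 0 <= b.start - a.end <= max_gap:
                    out.append(Span(kind, doc[a.start:b.end], a.start, b.end))
        return out

    return op


def run_flow(doc: str, flow: List[Operator]) -> List[Span]:
    """Apply the operators in order; the flow itself is plain data."""
    spans: List[Span] = []
    for op in flow:
        spans = op(doc, spans)
    return spans


def resolve_id(span: Span, registry: Dict[str, str]) -> Optional[str]:
    """Map an extracted entity to a global identifier.
    `registry` is a stand-in for an identifier provider."""
    return registry.get(span.text.lower())


# Assumed sample input and flow description.
DOC = "I bought the ThinkPad T400 for 899 EUR last week."
FLOW = [
    extract("product", r"ThinkPad \w+"),
    extract("price", r"\d+ EUR"),
    aggregate("offer", "product", "price", max_gap=10),
]
REGISTRY = {"thinkpad t400": "urn:example:product:thinkpad-t400"}

spans = run_flow(DOC, FLOW)
offers = [s.text for s in spans if s.kind == "offer"]
product = next(s for s in spans if s.kind == "product")
product_id = resolve_id(product, REGISTRY)
```

Because `FLOW` is an ordinary list, an extraction flow in this sketch can be reordered, extended, or customized without touching the operator implementations, which is the property the declarative description is meant to provide.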


Similar resources

Linguistic Devices of Identity Representation in English Political Discourse with a Focus on Personal Pronouns: Power and Solidarity

The present study aimed to explore the use of pronominal reference for identity representation in terms of power and solidarity in English political discourse. The investigation was based on a corpus of four political interviews and debates amounting to 26,500 words. The analysis was both qualitative and quantitative. In the qualitative analysis, a discourse-analytic approach was used to fin...


Survey of Temporal Knowledge Representation (Second Exam)

Knowledge Representation (KR) is a subfield within Artificial Intelligence that aims to represent, store, and retrieve knowledge in a symbolic form that is easily manipulated by a computer. The vision of the Semantic Web has recently gained prominence because of interest in using and applying the Knowledge Representation methodology in both academia and industry. Knowledge Representation formalism a...


The Semantic Web: Webizing Knowledge Representation

The World Wide Web opens up new opportunities for the use of knowledge representation: a formal description of the semantic content of Web pages can allow better processing by computational agents. Further, the naming scheme of the Web, using Universal Resource Indicators, allows KR systems to avoid the ambiguities of natural language and to allow linking between semantic documents. These capab...


KRRT: Knowledge Representation and Reasoning Tutor System

Knowledge Representation & Reasoning (KR&R) is a fundamental topic in Artificial Intelligence. A basic KR language is First-Order Logic (FOL), the most representative logic-based representation language, which is part of almost any introductory AI course. In this work we present KRRT (Knowledge Representation & Reasoning Tutor). KRRT is a Web-based system whose main goal is to help the student...


WebQR: Building a knowledge representation application on the Semantic Web

The Semantic Web (SW) was originally positioned as a combination of Knowledge Representation (KR) and the Web. However, most applications that use SW data today lean more towards the Information Retrieval spectrum. The reason for this is that traditional KR systems are designed to work with datasets that are small, curated, homogeneous, and application-specific. However, the SW is large-scale, ...



Publication date: 2009